


1 ATGGGGGAGA TGCAGGGCGC GCTGGCCAGA GCCCGGCTCG AGTCCCTGCT 
51 GCGGCCCCGC CACAAAAAGA GGGCCGAGGC GCAGAAAAGG AGCGAGTCCT 
101 TCCTGCTGAG CGGACTGGCT TTCATGAAGC AGAGGAGGAT GGGTCTGAAC 
151 GACTTTATTC AGAAGATTGC CAATAACTCC TATGCATGCA AACACCCTGA 
201 AGTTCAGTCC ATCTTGAAGA TCTCCCAACC TCAGGAGCCT GAGCTTATGA 
251 ATGCCAACCC TTCTCCTCCA CCAAGTCCTT CTCAGCAAAT CAACCTTGGC 
301 CCGTCGTCCA ATCCTCATGC TAAACCATCT GACTTTCACT TCTTGAAAGT 
351 GATCGGAAAG GGCAGTTTTG GAAAGGTTCT TCTAGCAAGA CACAAGGCAG 
401 AAGAAGTGTT CTATGCAGTC AAAGTTTTAC AGAAGAAAGC AATCCTGAAA
451 AAGAAAGAGG AGAAGCATAT TATGTCGGAG CGGAATGTTC TGTTGAAGAA 
501 TGTGAAGCAC CCTTTCCTGG TGGGCCTTCA CTTCTCTTTC CAGACTGCTG 
551 ACAAATTGTA CTTTGTCCTA GACTACATTA ATGGTGGAGA GTTGTTCTAC 
601 CATCTCCAGA GGGAACGCTG CTTCCTGGAA CCACGGGCTC GTTTCTATGC 
651 TGCTGAAATA GCCAGTGCCT TGGGCTACCT GCATTCACTG AACATCGTTT 
701 ATAGAGACTT AAAACCAGAG AATATTTTGC TAGATTCACA GGGACACATT 
751 GTCCTTACTG ACTTCGGACT CTGCAAGGAG AACATTGAAC ACAACAGCAC 
801 AACATCCACC TTCTGTGGCA CGCCGGAGTA TCTCGCACCT GAGGTGCTTC
851 ATAAGCAGCC TTATGACAGG ACTGTGGACT GGTGGTGCCT GGGAGCTGTC
901 TTGTATGAGA TGCTGTATGG CCTGCCGCCT TTTTATAGCC GAAACACAGC
951 TGAAATGTAC GACAACATTC TGAACAAGCC TCTCCAGCTG AAACCAAATA
1001 TTACAAATTC CGCAAGACAC CTCCTGGAGG GCCTCCTGCA GAAGGACAGG
1051 ACAAAGCGGC TCGGGGCCAA GGATGACTTC ATGGAGATTA AGAGTCATGT
1101 CTTCTTCTCC TTAATTAACT GGGATGATCT CATTAATAAG AAGATTACTC
1151 CCCCTTTTAA CCCAAATGTG AGTGGGCCCA ACGACCTACG GCACTTTGAC
1201 CCCGAGTTTA CCGAAGAGCC TGTCCCCAAC TCCATTGGCA AGTCCCCTGA
1251 CAGCGTCCTC GTCACAGCCA GCGTCAAGGA AGCTGCCGAG GCTTTCCTAG
1301 GCTTTTCCTA TGCGCCTCCC ACGGACTCTT TCCTCTGA (SEQ ID NO:1)

FEATURES:

Start Codon: 1
Stop Codon: 1336 
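
The start and stop positions listed above can be checked by translating the coding sequence in frame 1. The following is a minimal Python sketch, assuming the standard genetic code; the name `translate_orf` is illustrative and is not part of the original listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_orf(dna, start=1):
    """Translate from the 1-based `start` position up to the first stop codon.

    Whitespace (as used in the 10-base blocks of the listing) is ignored.
    Returns (protein, stop_position); stop_position is the 1-based position of
    the first base of the stop codon, or None if no stop codon is reached.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        # Codons containing ambiguous bases are reported as X.
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            return "".join(protein), i + 1
        protein.append(aa)
    return "".join(protein), None


if __name__ == "__main__":
    # First three 10-base blocks of SEQ ID NO:1.
    print(translate_orf("ATGGGGGAGA TGCAGGGCGC GCTGGCCAGA"))
```

Applied to the full SEQ ID NO:1, the second element of the returned tuple is the value to compare against the stop codon position given above.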



Homologous proteins: 
TOP 10 BLAST Hits 



CRA|67000082668077 gi|14756346 /def=ref|XP_037046.1| (XM...      843  0.0
CRA|18000005074572 gi|5032091 /def=ref|NP_005618.1| (NM_...      841  0.0
CRA|18000004907445 gi|477098 /def=pir||A48094 serum and ...      829  0.0
CRA|150000165029864 gi|13431833 /def=sp|Q9XT18|SGK_RABIT...      826  0.0
CRA|18000005246968 gi|6755490 /def=ref|NP_035491.1| (NM_...      824  0.0
CRA|18000004937507 gi|9507093 /def=ref|NP_062105.1| (NM_...      822  0.0
CRA|18000005171986 gi|3688803 /def=gb|AAC62398.1| (AF057...      776  0.0
CRA|18000005144813 gi|3116066 /def=emb|CAA11528.1| (AJ22...      711  0.0
CRA|18000005144812 gi|3116064 /def=emb|CAA11527.1| (AJ22...      709  0.0
CRA|335001098677651 gi|11321321 /def=gb|AAG34115.1|AF312...      579  e-164
Blast hits to dbEST:

CRA Number             gi Number      Score   Expect
CRA|63000075619018     gi|14816226    1562    0.0
CRA|225000014874111    gi|18504859    1503    0.0
CRA|225000001750119    gi|15756574    1441    0.0
CRA|159000009754651    gi|13582978    1417    0.0
CRA|158000041295522    gi|10994817    1407    0.0
CRA|58000099052833     gi|12793499    1402    0.0
CRA|66000078090204     gi|15017913    1402    0.0
CRA|225000014943770    gi|18509828    1392    0.0
CRA|63000075528266     gi|14809983    1368    0.0
CRA|78000169320891     gi|14073071    1364    0.0
CRA|335001063053989    gi|10937149    1296    0.0
CRA|78000106804089     gi|10390589    1285    0.0
CRA|118000029469319    gi|10933084    1259    0.0
CRA|11000545765847     gi|9176035     1239    0.0
CRA|222000003126349    gi|16520713    1225    0.0
CRA|158000041316197    gi|10996884    1223    0.0
CRA|55000120105106     gi|13994615    1197    0.0
CRA|222000003339745    gi|16529516    1191    0.0
CRA|78000169332857     gi|14074159    1183    0.0
CRA|224000000151825    gi|15161415    1181    0.0
CRA|158000041310407    gi|10996305    1174    0.0
CRA|224000004588220    gi|15950481    1172    0.0
CRA|98000052723591     gi|13976289    1158    0.0
CRA|222000000718505    gi|15248907    1156    0.0
CRA|78000105668173     gi|10352924    1152    0.0
CRA|164000029925315    gi|11001544    1146    0.0
CRA|164000029914485    gi|11000461    1132    0.0
CRA|107000020244468    gi|9330126     1132    0.0
CRA|55000120090850     gi|13993063    1132    0.0
CRA|158000041316657    gi|10996930    1122    0.0
CRA|78000106872608     gi|10398650    1122    0.0
CRA|165000138685182    gi|14292101    1100    0.0
CRA|164000029922335    gi|11001246    1094    0.0
CRA|335001063029672    gi|10947466    1094    0.0
CRA|113000083010569    gi|12041410    1086    0.0
CRA|222000001551530    gi|15446801    1070    0.0
CRA|78000105702017     gi|10357421    1053    0.0
CRA|107000020410692    gi|9345242     1049    0.0
CRA|107000020408053    gi|9345002     1049    0.0
CRA|196000006451565    gi|12101876    1045    0.0
CRA|154000034710284    gi|10331724    1031    0.0
CRA|58000099305099     gi|12895306    1025    0.0
CRA|222000001550936    gi|15446747    1021    0.0
CRA|78000105667113     gi|10352757    1017    0.0
CRA|3000001081732      gi|2877701     1009    0.0
CRA|63000075437648     gi|14804614    1003    0.0
CRA|114000006533071    gi|8142744     985     0.0
CRA|149000089175289    gi|13402435    981     0.0
CRA|155000041341862    gi|10144531    977     0.0
CRA|1000490895790      gi|5433043     954     0.0



EXPRESSION INFORMATION FOR MODULATORY USE: 

library source: 



gi Number      Organ     Tissue Type
gi|13994615    brain     hypothalamus
gi|10999153    PLACE1    placenta
gi|10947466    MAMMA1    mammary gland
gi|10996930    PLACE1    placenta
gi|15438670    brain     hippocampus
1 MGEMQGALAR ARLESLLRPR HKKRAEAQKR SESFLLSGLA FMKQRRMGLN
51 DFIQKIANNS YACKHPEVQS ILKISQPQEP ELMNANPSPP PSPSQQINLG 
101 PSSNPHAKPS DFHFLKVIGK GSFGKVLLAR HKAEEVFYAV KVLQKKAILK 
151 KKEEKHIMSE RNVLLKNVKH PFLVGLHFSF QTADKLYFVL DYINGGELFY 
201 HLQRERCFLE PRARFYAAEI ASALGYLHSL NIVYRDLKPE NILLDSQGHI 
251 VLTDFGLCKE NIEHNSTTST FCGTPEYLAP EVLHKQPYDR TVDWWCLGAV 
301 LYEMLYGLPP FYSRNTAEMY DNILNKPLQL KPNITNSARH LLEGLLQKDR 
351 TKRLGAKDDF MEIKSHVFFS LINWDDLINK KXTPPFNPNV SGPNDLRHFD 
401 PEFTEEPVPN SIGKSPDSVL VTASVKEAAE AFLGFSYAPP TDSFL (SEQ ID NO: 2) 



FEATURES: 

Functional domains and key regions: 
Prosite results: 

PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site
Number of matches: 4
1 58-61 NNSY
2 265-268 NSTT
3 333-336 NITN
4 389-392 NVSG

PDOC00004 PS00004 CAMP_PHOSPHO_SITE
cAMP- and cGMP-dependent protein kinase phosphorylation site
380-383 KKXT

PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site
Number of matches: 4
1 159-161 SER
2 337-339 SAR
3 351-353 TKR
4 424-426 SVK

PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site
424-427 SVKE

PDOC00007 PS00007 TYR_PHOSPHO_SITE
Tyrosine kinase phosphorylation site
130-138 RHKAEEVFY

PDOC00008 PS00008 MYRISTYL
N-myristoylation site
175-180 GLHFSF

PDOC00100 PS00107 PROTEIN_KINASE_ATP
Protein kinases ATP-binding region signature
118-150 IGKGSFGKVLLARHKAEEVFYAVKVLQKKAILK

PDOC00100 PS00108 PROTEIN_KINASE_ST
Serine/Threonine protein kinases active-site signature
232-244 IVYRDLKPENILL
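
Short Prosite motifs such as those listed above can be located with regular expressions. The sketch below is illustrative only: it covers three of the patterns (PS00001 `N-{P}-[ST]-{P}`, PS00005 `[ST]-x-[RK]`, PS00006 `[ST]-x(2)-[DE]`), hand-translated to Python regex syntax, and the names `PATTERNS` and `find_sites` are hypothetical helpers, not part of any Prosite tool.

```python
import re

# Prosite patterns rewritten as Python regular expressions (illustrative subset).
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",   # PS00001: N-{P}-[ST]-{P}
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",        # PS00005: [ST]-x-[RK]
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",     # PS00006: [ST]-x(2)-[DE]
}


def find_sites(protein, pattern, offset=1):
    """Return (start, end, motif) for every match, including overlapping ones.

    `offset` is the 1-based position of the first residue of `protein`, so a
    fragment of a longer sequence can be scanned with correct numbering.
    """
    # A lookahead with a capture group finds overlapping occurrences.
    rx = re.compile(f"(?=({pattern}))")
    sites = []
    for m in rx.finditer(protein):
        motif = m.group(1)
        start = m.start() + offset
        sites.append((start, start + len(motif) - 1, motif))
    return sites


if __name__ == "__main__":
    # Residues 51-70 of SEQ ID NO:2.
    print(find_sites("DFIQKIANNSYACKHPEVQS", PATTERNS["ASN_GLYCOSYLATION"], offset=51))
```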



Membrane spanning structure and domains: 
Helix Begin End Score Certainty 
1 293 313 0.948 Putative 

BLAST Alignment to Top Hit:

>CRA|67000082668077 gi|14756346 /def=ref|XP_037046.1|
(XM_037046) serum/glucocorticoid regulated kinase [Homo
sapiens] /org=Homo sapiens /taxon=9606 /div=PRI
/dataset=nraa /length=431
Length = 431

Score = 843 bits (2154), Expect = 0.0
Identities = 406/407 (99%), Positives = 407/407 (99%)
Frame = +1

Query: 292  LAFMKQRRMGLNDFIQKIANNSYACKHPEVQSILKISQPQEPELMNANPSPPPSPSQQIN 471
            +AFMKQRRMGLNDFIQKIANNSYACKHPEVQSILKISQPQEPELMNANPSPPPSPSQQIN
Sbjct: 25   IAFMKQRRMGLNDFIQKIANNSYACKHPEVQSILKISQPQEPELMNANPSPPPSPSQQIN 84

Query: 472  LGPSSNPHAKPSDFHFLKVIGKGSFGKVLLARHKAEEVFYAVKVLQKKAILKKKEEKHIM 651
            LGPSSNPHAKPSDFHFLKVIGKGSFGKVLLARHKAEEVFYAVKVLQKKAILKKKEEKHIM
Sbjct: 85   LGPSSNPHAKPSDFHFLKVIGKGSFGKVLLARHKAEEVFYAVKVLQKKAILKKKEEKHIM 144

Query: 652  SERNVLLKNVKHPFLVGLHFSFQTADKLYFVLDYINGGELFYHLQRERCFLEPRARFYAA 831
            SERNVLLKNVKHPFLVGLHFSFQTADKLYFVLDYINGGELFYHLQRERCFLEPRARFYAA
Sbjct: 145  SERNVLLKNVKHPFLVGLHFSFQTADKLYFVLDYINGGELFYHLQRERCFLEPRARFYAA 204

Query: 832  EIASALGYLHSLNIVYRDLKPENILLDSQGHIVLTDFGLCKENIEHNSTTSTFCGTPEYL 1011
            EIASALGYLHSLNIVYRDLKPENILLDSQGHIVLTDFGLCKENIEHNSTTSTFCGTPEYL
Sbjct: 205  EIASALGYLHSLNIVYRDLKPENILLDSQGHIVLTDFGLCKENIEHNSTTSTFCGTPEYL 264

Query: 1012 APEVLHKQPYDRTVDWWCLGAVLYEMLYGLPPFYSRNTAEMYDNILNKPLQLKPNITNSA 1191
            APEVLHKQPYDRTVDWWCLGAVLYEMLYGLPPFYSRNTAEMYDNILNKPLQLKPNITNSA
Sbjct: 265  APEVLHKQPYDRTVDWWCLGAVLYEMLYGLPPFYSRNTAEMYDNILNKPLQLKPNITNSA 324

Query: 1192 RHLLEGLLQKDRTKRLGAKDDFMEIKSHVFFSLINWDDLINKKITPPFNPNVSGPNDLRH 1371
            RHLLEGLLQKDRTKRLGAKDDFMEIKSHVFFSLINWDDLINKKITPPFNPNVSGPNDLRH
Sbjct: 325  RHLLEGLLQKDRTKRLGAKDDFMEIKSHVFFSLINWDDLINKKITPPFNPNVSGPNDLRH 384

Query: 1372 fdpefteepvpnsigkspdsvlvtasvkeaaeaflgfsyapptdsfl 1512
            FDPEFTEEPVPNSIGKSPDSVLVTASVKEAAEAFLGFSYAPPTDSFL
Sbjct: 385  FDPEFTEEPVPNSIGKSPDSVLVTASVKEAAEAFLGFSYAPPTDSFL 431 (SEQ ID NO:4)
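
The identity count reported in the alignment header can be recomputed from the aligned Query and Sbjct rows. The sketch below assumes the two rows have already been concatenated into equal-length strings, with "-" marking gaps; `alignment_identity` is an illustrative name, not a BLAST function.

```python
def alignment_identity(query, sbjct):
    """Count identical columns in a pairwise alignment.

    Returns (identical, alignment_length, percent). Comparison is
    case-insensitive because masked query regions may be printed in lowercase.
    A column containing a gap character never counts as identical.
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have the same length")
    if not query:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for q, s in zip(query.upper(), sbjct.upper())
        if q == s and q != "-"
    )
    return identical, len(query), 100.0 * identical / len(query)


if __name__ == "__main__":
    # First ten columns of the alignment: one L/I substitution.
    print(alignment_identity("LAFMKQRRMG", "IAFMKQRRMG"))
```

For reference, 406 identical columns out of 407 corresponds to roughly 99.75% identity.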



Hmmer search results (Pfam): 

Scores for sequence family classification (score includes all domains): 



Model     Description                                      Score    E-value   N
PF00069   Eukaryotic protein kinase domain                 298.1    1.1e-85   1
PF00433   Protein kinase C terminal domain                 56.0     5.6e-16   1
CE00022   CE00022 MAGUK_subfamily_d                        24.7     3.5e-07   1
CE00359   CE00359 bone_morphogenetic_protein_receptor      14.6     0.0019    2
PF00787   PX domain                                        7.2      2.4       1
CE00031   CE00031 VEGFR                                    0.6      2.7       1
CE00292   CE00292 PTK_membrane_span                        -49.8    3.8e-06   1
CE00287   CE00287 PTK_Eph_orphan_receptor                  -53.1    7.6e-05   1
CE00289   CE00289 PTK_PDGF_receptor                        -67.6    0.37      1
CE00291   CE00291 PTK_fgf_receptor                         -87.2    0.00097   1
CE00286   CE00286 PTK_EGF_receptor                         -88.2    1e-05     1
CE00290   CE00290 PTK_Trk_family                           -153.3   9.1e-05   1
CE00016   CE00016 GSK_glycogen_synthase_kinase             -159.4   8.3e-08   1


Parsed for domains: 



Model     Domain   seq-f   seq-t      hmm-f   hmm-t      score    E-value
PF00787   1/1      40      75    ..   108     147   .]   7.2      2.4
CE00359   1/2      115     143   ..   144     175        0.5      19
CE00289   1/1      110     212   ..   1       109   [.   -67.6    0.37
CE00031   1/1      216     260   ..   1051    1095       0.6      2.7
CE00359   2/2      232     283   ..   272     327        12.8     0.0064
CE00290   1/1      112     341   ..   1       282   [.   -153.3   9.1e-05
CE00291   1/1      112     345   ..   1       285   []   -87.2    0.00097
CE00286   1/1      112     349   ..   1       263   []   -88.2    1e-05
CE00292   1/1      112     354   ..   1       288   []   -49.8    3.8e-06
CE00022   1/1      223     357   ..   133     275        24.7     3.5e-07
CE00287   1/1      112     367   ..   1       260   [.   -53.1    7.6e-05
PF00069   1/1      112     369   ..   1       278   []   298.1    1.1e-85
CE00016   1/1      41      442   ..   1       433   []   -159.4   8.3e-08
PF00433   1/1      370     444   ..   1       70    []   56.0     5.6e-16
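
Per-domain hits like those above are often filtered by E-value before interpretation. The following is a hedged sketch that assumes each hit is written on one whitespace-separated line in the column order Model, Domain, seq-f, seq-t, ..., score, E-value; `parse_domain_row` and `significant_domains` are hypothetical helper names, and the 1e-5 cutoff is an arbitrary illustration rather than a threshold stated in this listing.

```python
def parse_domain_row(line):
    """Parse one 'Parsed for domains' row into a dictionary.

    Only the first four and last two columns are read, so rows whose
    boundary symbols ('..', '[]', ...) are missing still parse.
    """
    tokens = line.split()
    if len(tokens) < 8:
        raise ValueError(f"unexpected row format: {line!r}")
    return {
        "model": tokens[0],
        "domain": tokens[1],
        "seq_from": int(tokens[2]),
        "seq_to": int(tokens[3]),
        "score": float(tokens[-2]),
        "evalue": float(tokens[-1]),
    }


def significant_domains(rows, max_evalue=1e-5):
    """Keep hits with a positive score and an E-value at or below the cutoff."""
    hits = [parse_domain_row(r) for r in rows if r.strip()]
    return [h for h in hits if h["score"] > 0 and h["evalue"] <= max_evalue]


if __name__ == "__main__":
    rows = [
        "PF00069 1/1 112 369 .. 1 278 [] 298.1 1.1e-85",
        "PF00787 1/1 40 75 .. 108 147 .] 7.2 2.4",
    ]
    for hit in significant_domains(rows):
        print(hit["model"], hit["seq_from"], hit["seq_to"], hit["evalue"])
```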
1 TCTGGCTCGT GCTCTCATGT CATCTCAGAG TTCCAGCTTA TCAGAGGCAT
51 GTAGCAGGGA GGCTTATTCC AGCCATAACT GGGCTCTACC TCCAGCCTCC 
101 AGAAGTAATC CCCAACCTGC ATATCCTTGG GCAACCCGAA GAATGAAAGA 
151 AGAAGCTATA AAACCCCCTT TGAAAGGTTC GTACTTACCG TACTATATTT 
201 TGCAGATGCC TCAAAGGATT TGGGGTTACT TGGCATGGGG AAGGCACATA 
251 AGGTGGGGTG TAGGAGAGGG TCTCTGGTTG TAGGTTTCTT AATTTAATGT 
301 TTGAAAACAA ACATGCAAAA GTCTGTGTGC AGGTTGATGT TTCTGGGCAG 
351 CCTGAGCAAA ATTTGCTCTC TCAAGAGGGA AAGGAACCAG GTGGGAGCAG 
401 AGCTAGGCTG GGCTAGGCTA GTTGAATGGT GGGACATGAC ATACGGGTGG 
451 CACTGGCAAT AACAAAGTCA CATTCTATGA AGATTCCCTG CAAGAGGAAG 
501 CAGACATGGG CCAGTTACTG TGATTTGAAA TTGCCTAAAC ATTGCTTTAG 
551 GTTGGCATGT CAATTTCAGG TACTAGTGTT TTTTTTGTTT TTGTTTTTGT
601 TTTGTTTTTG TTTGTTTGTT TGTTTTGAGA CGGAGTCTCG CTCTGTTGCC
651 AGGCTGGAGT GCAGTGGCGT GATCTCGGCT CACTGCAACC TCCGCCTCCC 
701 GGGTTCAAGC GATTCTCCTG CCTCAGCCTC CCGAGTAACT GGGACTACAG 
751 GCGCACGCCA CCACGCCTGG CTAATTTTTC TATTTTCAGT AGAGACGGGG
801 TTTCACCATG TTAGCCAGGA TGGTCTCGAT CTCTTGACCT CGTGATCCGC 
851 CCGCCTTGGC CTCCCAAAGT GCTGGGATTA CAGGCGTGAG CCACTGCGCC 
901 CGGCCCCAGT AAATGCTTTT TATAAGTGTG GGCACTGAGC AAACTTTCCC 
951 AGCCAGACTC CAGGAGAGAG AATGTGTTTC CCTTCTCTCG GTTTGGGGCT 
1001 GTTGCAACAA AGCAAACCAA GGAGTTGAGA CTAGAGCTCA CTTTAGGGCA 
1051 AGTGGGGGTG GTTTTGCCTG CAAAACAAAC CCCTGCCCAA GACCAAGGAA 
1101 AAGGCGTTTC ACATGCTATT CCTGGTTTGA CAGCTGGTAT TTCGGGACTG 
1151 TGCCAGATCC AGTAGGCAAC TTTAAAATGG CAGAGCCTTT GGTAGCAAGA 
1201 GGTCATGGCA GGGCAGCCAC CGCAGACAGC AACAGCGAGC GCCAGGTACC 
1251 TGGCCCTGCG AATAGTGGTA ACTTGTAACT GCCCGCTCCG GGCCCAGTCG 
1301 CTGTGCTCGC GGCTTCCCGG CCAGCACTGG CTCACGTCCC CGCGCCGGCG 
1351 GTCAGGCTGC GGCTCCCAGA CATCCCCCAG CCGCGGGGTT ACTGGAAGGC 
1401 ACCGGCATCG CTGTTCTGCA GAGCCCGGGC CGCCGCCTCG AGCTTCCCTC 
1451 TCTTCCCTGC CTTCTGCAGC GGAGTCACCC GGCTAATCTT TCAGGATAAA 
1501 GTCACAGTTT ATGTGGGACT CACATAAAGA GCGAGCGAGG TGGCAAAACT 
1551 AAGAAGCCCT GGGGCAGCCT TGAGTTAAAC CCAGGGAGGG TAGGGACGAT 
1601 TTTAAGACCA TGTATCATGA CCTGCAGGGT TTTCAGGTGG GACAGCGGGA 
1651 GAGGAGCAGG CCCCACAGAG GAATCGAGGA TGCCCGGTTC ACGCCAGGTC 
1701 TGCCCCCGGG CAAAGCTACC CCTCCCTTCG CTTGTTACCT CCTCACGTGT 
1751 TCTTGGCATG GCAGAGATTA AAAATGCAAG GAAAAAAATT ACATGCGGAA 
1801 CGGACAAAAT GTTCTCAGAG ATTACTTCAG AAAAAAAAAA GTGAAATGCA 
1851 GATTGTACTT CTTCCTTTAG TGCAGAGACG ACTTTTATTT CCGCCCCCTC 
1901 CCCTCCACAT TCCTGACCTC TCCCTCCCCC TTTTCCCTCT TTCTTTCCTT 
1951 CCTTCCTCCT CTTCCAAGTT CTGGGATTTT TCAGCCTTGC TTGGTTTTGG 
2001 CCAAAAGCAC AAAAAAGGCG TTTTCGGAAG CGACCCGACC GTGCACAAGG 
2051 GCCATTTGTT TGTTTTGGGA CTCGGGGCAG GAAATCTTGC CGGGCCTGAG
2101 TCACGGCGGC TCCTTCAAGG AAACGTCAGT GCTCGCCGGT CGCTCTCGTC 
2151 TGCCGCGCGC CCCGCCGCCC GCTGCCCATG GGGGAGATGC AGGGCGCGCT 
2201 GGCCAGAGCC CGGCTCGAGT CCCTGCTGCG GCCCCGCCAC AAAAAGAGGG 
2251 CCGAGGCGCA GAAAAGGAGC GAGTCCTTCC TGCTGAGCGG ACTGGGTAAG 
2301 CGCCGCCGCC GGCCCCGCTG GGGGCTTGGC TCACTTCCCC AGAGCGGCTT
2351 GGAGGCAGGG GCCGGCTTTC GTCGGAGTTC TCGGGGCCGG GGTCCCGGCG 
2401 GCGGGAACGG GAGGACCTGG CGGGCGAGGT CGCGCGCGCA GGCCTGCGCC 
2451 CCAGGGATAA ACCCCGGAGG GTGGCGCGCA CCGCCGGCTC GGGTTGGGGA 
2501 GGAGGGTGGG AGTCCGGCCG CAGGACGGCG CCTGGCCGGG GAGAGGGTAT 
2551 CTGCAGGGAC AGTGAGCGAA GCCACCGTGG CCGCCGCGCA CCCGCCGGGA 
2601 AGCGCTTCGG CGCTGCGAAC CCGGCTTTCT CCGGCGGCGG AATAAATGAG 
2651 AGAGGTGGAA AACTACCCCG GGCTCTCCGG CCCTCCCCGC GCCCTCCGCC 
2701 GGCGCGTTCT CTCTCTCCTG CCCCAGGAGC CGATGGAGAC TGATAACGGC 
2751 CCTGCGCCAG GCCGTCCCCG GGCGGTCCTC GCGCCCCCGC CCGGGGCTCG 
2801 CCCTCTCAAT GGGGACAGAA CCGCCCGCCG CAGGCAGCGT AGCCGCCAGC 
2851 AAACCGCGAG GCGGTCGGGG CGGGGCGAGG GGCGAGGCGA AGGGCGGGGC 
2901 CACTTCTCAC TGTCGCGCAG GCCCCGCCCC CGCGGCGGTG CCTTTTTTAT
2951 AAGGCCGAGC GCGCGGCCTG GCGCAGCATA CGCCGAGCCG GTCTTTGAGC 
3001 GCTAACGTCT TTCTGTCTCC CCGCGGTGGT GATGACGGTG AAAACTGAGG 
3051 CTGCTAAGGG CACCCTCACT TACTCCAGGA TGAGGGGCAT GGTGGCAATT 
3101 CTCATCGGTG AGTGCAGGAA TCTTGCGGGA CTTCTGCTCC AGGAGACGCA
3151 AAGTGGAAAT TTTTTGAAAG TCCCGGATCA GATTAGTGTG TGTGGCGCCG
3201 GACGTTATGA AGCCGTCTAA ACGTTTCTTT ATTTCTCCTC CTTCATCCAC
3251 AGCTTTCATG AAGCAGAGGA GGATGGGTCT GAACGACTTT ATTCAGAAGA
3301 TTGCCAATAA CTCCTATGCA TGCAAACAGT AAGTTTGACC GGATTTGAGG
3351 AAATAACTAG TATAGTTTGA ATTTGCCAGC GGTAAACATT CTCATCACGG
3401 CGTTTATCGG GAAGGCGAAG ACTTCTTCTG GGGTGGGGAT CTCATTTCTC
3451 CTTAAATTCT AATATATTTG ACACATTTTA AACATTAAAG TTAATTTGCT
3501 GATTTGGCTT GAACTGGAGA TGTAAGATAA ATGGTTCGTG TTGGCCGAAT
3551 TCACGGCCTT TCTCCATGAG CAACAATCCT TATTTCTGTA TTTAATGGGG
3601 TTTATTATTT TCTTTAACTG ACTAATGTAT TGGGGTATTT TCAGTTTAAA
3651 CAGTGAATTA TCCGGGTAGA AGTCGGTAGA GCCAGGAAAC TCACTTTTGA
3701 TGTTGGTGTG CCCCCTAGTG GCGAGCTGGA TTCTAAATCG TGCCCTTTAT
3751 TCCCTGCAGC CCTGAAGTTC AGTCCATCTT GAAGATCTCC CAACCTCAGG

3801 AGCCTGAGCT TATGAATGCC AACCCTTCTC CTCCAGTAAG TTTTTGTATG
3851 TGCCGTGCAT CTGTGGAGAA CTGTAAGGGA GTCAGTTAGT ATTCCTACAT 
3901 TAATGGATTA AAATAGCATT TCTAGAAATT AGTATCAAGG CAGGAATGCT 
3951 TCATTATGGC ATAACAAGTG ATATAAATAT TTAAGTATTG AGTCAGAGTA 
4001 TTATTTTATT TTTTTCCTGG GCATATTTTA CCTCCAAAGT GGTTATTTTA
4051 AAAGGCATAT TTCATAAAAA GGTTTTATCT GTCTGAAACA ACATGACTGT 
4101 GTGCAGTTTC CATACTCATT TGAAATGTGA TGAAATGTAG TTTTGAATGT 
4151 TTATAGATGT ATGGTCATTT GCATCAGTCA TTTGTAGATG TAACATTTTC 
4201 TACATCGTTT ATGTTATAGA TGTCTTCCTT TGAAGCAATG GTATTAAAAG 
4251 AAATTCTTTT TTTTTTTTTC TAGCCAAGTC CTTCTCAGCA AATCAACCTT
4301 GGCCCGTCGT CCAATCCTCA TGCTAAACCA TCTGACTTTC ACTTCTTGAA 
4351 AGTGATCGGA AAGGGCAGTT TTGGAAAGGT AATTTCAAAT CTGAAGATCT 
4401 TTTGGTACAC TTCCTTCATG TCCTCTTTTA TATTCTCCCT GGATGAGGAT 
4451 AGAAAAATGA TTTTTTTAAA TTGAAATTTC AGGTTCTTCT AGCAAGACAC
4501 AAGGCAGAAG AAGTGTTCTA TGCAGTCAAA GTTTTACAGA AGAAAGCAAT 
4551 CCTGAAAAAG AAAGAGGTAT GAGATGTGCT TGATGGGGCT GGCATTGGCG 
4601 GTAGACACTC CTTGAATAAT CTTGATTCTG GAATGTTGGT GCCAAGTTGA
4651 AACATGCCAC TAAATCTGAA TCGTCATTTT CCTAGGAGAA GCATATTATG 
4701 TCGGAGCGGA ATGTTCTGTT GAAGAATGTG AAGCACCCTT TCCTGGTGGG 
4751 CCTTCACTTC TCTTTCCAGA CTGCTGACAA ATTGTACTTT GTCCTAGACT 
4801 ACATTAATGG TGGAGAGGTG AGCAGGGGGG ATAGAAGTCA ACTCTTAGTG 
4851 TCTCTGCACA GCCTGCTTTG TTTTAGTTTG AGAAAAAAGT TTTCAAAGAT
4901 TTTTGGTGGG GAGAATGTTA CCAGAATTAG CATTTCCTTC AACCTGTCAG 
4951 GTTTATAGTT AATAGATTAC TTGGGGCCAC TTCCTGCAGT TGTTCTTTTG 
5001 CTGTGTATGT CAAAACTAAT TAAATTCATT TGCAACCCAG AATGACTTTG 
5051 TTCTGTCTCC TGCAGTTGTT CTACCATCTC CAGAGGGAAC GCTGCTTCCT 
5101 GGAACCACGG GCTCGTTTCT ATGCTGCTGA AATAGCCAGT GCCTTGGGCT 
5151 ACCTGCATTC ACTGAACATC GTTTATAGGT AAGCCTGAGA GCTCTTCAGG 
5201 CTACCAGTTT TGGTATAAAG GAGACGTAGC ACTGGCTGTT TCATAGGGCC 
5251 TTAAAATAAT TTGTGTTTAT TTGCAACTTG GTTGCCTAAA ACCAGATCCC 
5301 CTAGCACGTG AGCTGGCTTG ACTTAAGTGC CAAGGGGGAA CCAGCCAAGT 
5351 AGGATTGTGC CTAATCCAGA ATAGATGAGC AGAACAAGGG CTCCCTTTTT
5401 TCTTCACTAC ACAACTACAG TGAACCTAAA ATGCCTCTAA TACCTTTAGC 
5451 AATTATCTTT AAGAGGATAT CTTATGAAGT GAAATTAACT TGTGCAACTA 
5501 CTTTTCTATT CACTTTTTTA CAGAGACTTA AAACCAGAGA ATATTTTGCT 
5551 AGATTCACAG GGACACATTG TCCTTACTGA CTTCGGACTC TGCAAGGAGA 
5601 ACATTGAACA CAACAGCACA ACATCCACCT TCTGTGGCAC GCCGGAGGTA 
5651 GGCGCTGTCT TGGTTTGGTG CCTGGTTTAC CCCCGCCTTC CAAGAGAGAG 
5701 ATGTACAATC ATGCACTTAA CTACCAAAAA GAGTAAACTC CTCTCAGAGA 
5751 CTTCTTAATA CAGTTCAGTG CAAATAAAAT ACATTTGCTG TTTGATGTAG 
5801 CATGAGAAAT CCCAAGTCCT TCTGTTCCTT TACTGAAAAG TAGCTGTTTG 
5851 TAAGTAAGAT CTGCATCATA AAAACTTTCT AAATCCCTAA GTAAGAGATA 
5901 TCAAGTGCCC AGCAGTTTCC TAAATGTCAG TACACATAGG TAGCCAGTCA 
5951 CCCTCAAAAA GTCCAGCAGT TTTATCAGGA AGGAATCTAA AGATATCTAT 
6001 CTTCCAAGCT GGCTCTGGGT CTCTCAGCTT TTTCAAACTA AATGTGTGGT 
6051 CGTGGGATTG CTTGCTTTCG CAGGTTCTAA ACGCTGTTTC CCTGGTCTGT 
6101 TTTTCAGTAT CTCGCACCTG AGGTGCTTCA TAAGCAGCCT TATGACAGGA 
6151 CTGTGGACTG GTGGTGCCTG GGAGCTGTCT TGTATGAGAT GCTGTATGGC 
6201 CTGGTGAGTG GCACATTGGG AACCATGGAA CACTGCCTGC TCCCTACAAT 
6251 ATTGCCTTCA CACAGCCCAT GCTTGGCCAT GGTGTCTTGC CCTTACCAGT 
6301 ACGCTTATCA AAAGCAGCTA AGAGGCATAT TGGTTATTTT ATAGTTCATA 
6351 AGAATAATCA CTTACCTGGT TCTTTTGTGC ATTTCACATT TTACTAGATA 
6401 GGACCACATT GAACCTGTGT GGTGGTGAAA AACTACCACT TATTAACATC 
6451 TACCCCCTCA CCCTCCACAC ACACACACAC AAACACACAC ACGGGTTGCA 
6501 AAGTAGACAC TTAAATAGCA AGGGAAAAGA AAGCATTGAG GTGGGGAGAG 
6551 TTTCTCAAAT CGAGCCTAAT ATTTATTGCC GTTTATATCT TTTTCTCTAC 
6601 TGGTAATGTG TGCCATATGA AACTTCCAAT TAAGTCTAAA GTAATTTTCC 
6651 CCTTCTTTCA GCCGCCTTTT TATAGCCGAA ACACAGCTGA AATGTACGAC 
6701 AACATTCTGA ACAAGCCTCT CCAGCTGAAA CCAAATATTA CAAATTCCGC 
6751 AAGACACCTC CTGGAGGGCC TCCTGCAGAA GGACAGGACA AAGCGGCTCG 
6801 GGGCCAAGGA TGACTTCGTG AGTGATGTTT TCCTGTCCTC CTGGGCCGGC 
6851 CGGGACGTGC ACTAGACCTC CCTGCCCTTA TTGAATGCAC CTGTCTAAAT 
6901 TAATCTTGGG TTTCTTATCA ACAGATGGAG ATTAAGAGTC ATGTCTTCTT
6951 CTCCTTAATT AACTGGGATG ATCTCATTAA TAAGAAGATT ACTCCCCCTT 
7001 TTAACCCAAA TGTGGTGAGT ATCTGTCTCT CTTCTAAGTA TAGAGAAGCC 
7051 CAAAGGGCAT TTATTTTAAT TCAGAATTGT CTGGGGGAGG GTTGGAAGGA 
7101 ATACATTGGC AGATGTTTTC TCCATAAACC TGTTATTTTA CCTACATAAA 
7151 AAGCACATTT TTGTGTCCCA ACAAGGCTCC CATAATTTTT AGACACATTT
7201 ATCAATTCGA AGCACCAAAA GGCAACAAGT GAACATTATT CTTATGTTTA 
7251 ACTGTGTGTA GCCTTTTGAG ATTTTGTGCT TGAAGTGGGT GATTATGGAA 
7301 GTTGATATAA GACTTAAACT TGGTATTTAA AGCCTGGTCA AGATTTCCCT 
7351 GTCCTGTGTC TAGTGTGAGT TCTTGACAAG AGTGTTTTTC CCTTCCCGTC 
7401 ACAGAGTGGG CCCAACGACC TACGGCACTT TGACCCCGAG TTTACCGAAG 
7451 AGCCTGTCCC CAACTCCATT GGCAAGTCCC CTGACAGCGT CCTCGTCACA 
7501 GCCAGCGTCA AGGAAGCTGC CGAGGCTTTC CTAGGCTTTT CCTATGCGCC 
7551 TCCCACGGAC TCTTTCCTCT GAACCCTGTT AGGGCTTGGT TTTAAAGGAT 
7601 TTTATGTGTG TTTCCGAATG TTTTAGTTAG CCTTTTGGTG GAGCCGCCAG 
7651 CTGACAGGAC ATCTTACAAG AGAATTTGCA CATCTCTGGA AGCTTAGCAA 
7701 TCTTATTGCA CACTGTTCGC TGGAAGCTTT TTGAAGAGCA CATTCTCCTC 
7751 AGTGAGCTCA TGAGGTTTTC ATTTTTATTC TTCCTTCCAA CGTGGTGCTA 
7801 TCTCTGAAAC GAGCGTTAGA GTGCCGCCTT AGACGGAGGC AGGAGTTTCG 
7851 TTAGAAAGCG GACGCTGTTC TAAAAAAGGT CTCCTGCAGA TCTGTCTGGG 
7901 CTGTGATGAC GAATATTATG AAATGTGCCT TTTCTGAAGA GATTGTGTTA 
7951 GCTCCAAAGC TTTTCCTATC GCAGTGTTTC AGTTCTTTAT TTTCCCTTGT
8001 GGATATGCTG TGTGAACCGT CGTGTGAGTG TGGTATGCCT GATCACAGAT 
8051 GGATTTTGTT ATAAGCATCA ATGTGACACT TGCAGGACAC TACAACGTGG 
8101 GACATTGTTT GTTTCTTCCA TATTTGGAAG ATAAATTTAT GTGTAGACTT
8151 TTTTGTAAGA TACGGTTAAT AACTAAAATT TATTGAAATG GTCTTGCAAT 
8201 GACTCGTATT CAGATGCTTA AAGAAAGCAT TGCTGCTACA AATATTTCTA 
8251 TTTTTAGAAA GGGTTTTTAT GGACCAATGC CCCAGTTGTC AGTCAGAGCC
8301 GTTGGTGTTT TTCATTGTTT AAAATGTCAC CTGTAAAATG GGCATTATTT 
8351 ATGTTTTTTT TTTTGCATTC CTGATAATTG TATGTATTGT ATAAAGAACG 
8401 TCTGTACATT GGGTTATAAC ACTAGTATAT TTAAACTTAC AGGCTTATTT 
8451 GTAATGTAAA CCACCATTTT AATGTACTGT AATTAACATG GTTATAATAC 
8501 GTACAATCCT TCCCTCATCC CATCACACAA CTTTTTTTGT GTGTGATAAA
8551 CTGATTTTGG TTTGCAATAA AACCTTGAAA AATATTTACA TATATTGTGT 
8601 CATGTGTTAT TTTGTATATT TTGGTTAAGG GGGTAATCAT GGGTTAGTTT 
8651 AAAATTGAAA ACCATGAAAA TCCTGCTGTA ATTTCCTGCT TAGTGGTTTG
8701 CTCCCAACAG CAGTGGTTTC TGACTCCAGG GGAGTATAGG ATGGTCTTAA 
8751 AGCCAACCTA CGTTCCAGGC CTTTTTAGCA GCATTTTATG GTGTCTGTCA
8801 TTCATAAATC CATCCAAGGA AATCCTTTGC AATTTACTCA TCTTGCAAGG 
8851 ATTGCTATGA AGTAATGCTT CCTGTATTTA TTGCCTGTCC TGTGAAGTTG 
8901 GACTATTTGT CCTGACATTT GGCTTGTCTT CAGTTACAGG TAATTCTTTC 
8951 CAGAAATATT TGAAAGCCTA CTCTGGGCTC TATTGCGAGT GCTCAGGATA 
9001 TCGTAGTGGA CAAAGCAGAC AACTTCGCCC TTCCAGAGCC TGATGAAGAA 
9051 GGCCGACCTA AAGCAGTTAG TTGAGATGGA AATTGAGAAA TAGTCTGTGA 
9101 AGTTTAGGAG AATGCCACAC AAGAGGGTGA GAATTTTTTT TTTTTTTTTT
9151 TTTTTTTTTG AGACACGGTC TTACTCTGTC GCCCAGGCTG GAGTGCAGTG
9201 GTGTGATCTT GGTTCACTGC AGCCTCCGCC TCCTGGGTTC ATGTGATCCC
9251 CCCATCTCAG CCTCCTGAGT AGCTGGGACT ACAGGCATGC ACCACCATGC 
9301 CTGGCTAATT TTTGTATTTT AGTAGAGATG GGATTTCACC ATGTGGGCCA 
9351 GGCTGGTCTC GAATCCCTGG CCTCAAGTGA TCTGTCTGCC TCGGCTTCCC 
9401 TAAGTGCTGG GGAGAATGTT TTAAATAAGT GGATATGTTC CCAAAAAGCT 
9451 GACCTGGCTG GGACATCTGG TTTCTGAGAG TACCTGGAGT TGACCCAGGT 
9501 CTAGAGTGAG CTCAGTAAAG GGACCCTGAA GGAGCTCATC CCTAGCTTGG 
9551 ACTGAAGCTT CTTGAGCCAG TGTCTACCTA GCACCCTAAG GGCCCAGCAG 
9601 GCTCTGGGGC TGTGTGGCAG AGCCCACTCC TAGAGCTCAC CCCACTGTGA 
9651 TATTACCTGT GGGAGAAAGC GAGGTGGCAC CATCCTTGGA GATCTTGAGT 
9701 CCAAAGGTTT GGACTTTTTC ACTCTTCTAG GCCTTCCACA CAAATACTTA 
9751 ACAAATAATC AGGGAATCCC CAAACAGTTG ATGTTGCTGC TGCCTTAATT 
9801 GCAAAAGCAC CCTGTAGGCC TGCTGCACCC CCGCTACCCT GACCTTCCAG 
9851 TTCGCACAGG GATTTCCCCA AGGGAAAGCT GTGAGCTTTT TTCCTCTTAT 
9901 CCTTGCTCTT GGGTCTCACC TCACTTTGCC TCAGTCCCCC TCTCCTACCC 
9951 CACAAGGTTT CCAAGGGCCA AACAGGTGTT CAGAGATAAC CGAGTTCTTC 
10001 TCCCTCATGA TCTAATGAAG GAAGAAGATG AAAACGAGTC GATAGCTTTT 
10051 TGCTCAAGGT GGGCCACCGG TCATGCTCTG CTGTTGACTT ACTGCTCTAC 
10101 AGGCATTAGC TACGTGTTCA ATTCCCTACC GGGCCCAGTT GACAAATAAA 
10151 GAGTCCAAAG CAAGGCCAGG CACGGTGGCT CACGCTTGTA ATCCCAGCAC 
10201 TTTGGGAGGC CGAGGCGGGC AGATCACGAG GTCAGGAGAT CGAGACCATC 
10251 CTGGCTAACA TGGTGAAACC CCGTCTCTAC TAAAAATACA AAAAAATTAG 
10301 CCGGGCGTGG TGGTGGGCGC CTGTAGTCCC AGCTACTCGG GAGGCTGAGG 
10351 CAGGAGAATG GCGTGAACCA GGGAGGCGGA GCTTGCAGTG AGCCGAGATC 
10401 GCACCACTGC ACTCCAGCCT GGGCGACAGA GCAAGACTCT GTCTCAAAAA 
10451 ACAAAACAAA ACAAAAGCAT GTATTTTCCT ATTAAAGATT GATGCCGGCT 
10501 CTAACATAGA GACTCATTGC ATATTCCCCC TCATTCTCAT TCTCAATAAC 
10551 AGTTATGAAT TCCTCCTCGA ACA (SEQ ID NO: 3) 
CHROMOSOME MAP POSITION:

chromosome 6 



ALLELIC VARIANTS (SNPs):

DNA
Position   Major   Minor   Domain
738        A       G       intron

Context:

DNA
Position
738        GACATACGGGTGGCACTGGCAATAACAAAGTCACATTCTATGAAGATTCCCTGCAAGAGG
           AAGCAGACATGGGCCAGTTACTGTGATTTGAAATTGCCTAAACATTGCTTTAGGTTGGCA
           TGTCAATTTCAGGTACTAGTGTTTTTTTTGTTTTTGTTTTTGTTTTGTTTTTGTTTGTTT
           GTTTGTTTTGAGACGGAGTCTCGCTCTGTTGCCAGGCTGGAGTGCAGTGGCGTGATCTCG
           GCTCACTGCAACCTCCGCCTCCCGGGTTCAAGCGATTCTCCTGCCTCAGCCTCCCGAGTA
           [A,G]
           CTGGGACTACAGGCGCACGCCACCACGCCTGGCTAATTTTTCTATTTTCAGTAGAGACGG
           GGTTTCACCATGTTAGCCAGGATGGTCTCGATCTCTTGACCTCGTGATCCGCCCGCCTTG
           GCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACTGCGCCCGGCCCCAGTAAATGCTT
           TTTATAAGTGTGGGCACTGAGCAAACTTTCCCAGCCAGACTCCAGGAGAGAGAATGTGTT
           TCCCTTCTCTCGGTTTGGGGCT
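
A context string of this form (upstream flank, bracketed alleles, downstream flank) can be derived from a reference sequence and a 1-based SNP position. The sketch below is a minimal illustration; `snp_context` is a hypothetical helper, and the default flank length of 300 bases is an assumption rather than a value stated in this listing.

```python
def snp_context(sequence, position, minor, flank=300):
    """Build 'LEFT[major,minor]RIGHT' around a 1-based SNP position.

    Whitespace in `sequence` is ignored, so the 10-base blocks of a sequence
    listing can be pasted in directly. The major allele is read from the
    reference sequence itself. Flanks are truncated at the sequence ends.
    """
    seq = "".join(sequence.split()).upper()
    if not 1 <= position <= len(seq):
        raise ValueError("position lies outside the sequence")
    idx = position - 1
    major = seq[idx]
    left = seq[max(0, idx - flank):idx]
    right = seq[idx + 1:idx + 1 + flank]
    return f"{left}[{major},{minor.upper()}]{right}"


if __name__ == "__main__":
    # Toy sequence; position 5 is an A.
    print(snp_context("ACGTA CGTAC", 5, "G", flank=2))
```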